Chinese Word Similarity Computing Based on Combination Strategy

Authors

Shaoru Guo

Yong Guan

Ru Li

Qi Zhang

Abstract

Chinese word similarity computing is a fundamental task in natural language processing. This paper presents a method for calculating the similarity between Chinese words based on a combination strategy. We train a Word2Vector model on Baidubaike, and then integrate three methods, a dictionary-based method, a Word2Vector-based method and a Chinese FrameNet (CFN)-based method, to calculate the semantic similarity between Chinese words. The dictionary-based method draws on semantic dictionaries such as HowNet, DaCilin, Tongyici Cilin (Extended) and Antonym. Experiments are performed on 500 word pairs, and the Spearman correlation coefficient on the test data is 0.524, which shows that the proposed method is feasible and effective.


Similar resources

The Research of Chinese Semantic Similarity Calculation Introduced Punctuations

So far, most Chinese natural language processing neglects punctuation or oversimplifies its function. To improve the efficiency of Chinese similarity computing, this paper presents a system model that addresses problems in Chinese sentence similarity computation. The model combines punctuation with traditional similarity computing. Co...


Combining a Chinese Thesaurus with a Chinese Dictionary

In this paper, we study the problem of combining a Chinese thesaurus with a Chinese dictionary by linking the word entries in the thesaurus with the word senses in the dictionary, and propose a similar word strategy to solve the problem. The method is based on the definitions given in the dictionary, but without any syntactic parsing or sense disambiguation on them at all. As a resu...


Semantic Similarity Computation Based on Multi-feature Combination using HowNet

Semantic similarity between words is becoming a generic problem for many applications of computational linguistics and artificial intelligence. The difficulty lies in developing a computational method that generates results close to human judgments. This paper proposes a semantic similarity approach that is based on multi-feature combination. One of the benc...


Word Similarity Computing Based on Hybrid Hierarchical Structure by HowNet

Word similarity computing is one of the most important and fundamental tasks in the field of natural language processing. Most word similarity methods perform well on synonyms, but not on words whose similarity is vague. The challenge is how to overcome this problem. An approach is proposed to compute Chinese word similarity based on hybrid hierarchical structure by How...


基於《知網》的辭彙語義相似度計算 (Word Similarity Computing Based on How-net)

Word similarity is broadly used in many applications, such as information retrieval, information extraction, text classification, word sense disambiguation, example-based machine translation, etc. There are two different methods used to compute similarity: one is based on ontology or a semantic taxonomy; the other is based on collocations of words in a corpus. As a lexical knowledgebase with ri...




Publication date: 2016
